SUMMARY. In February 2003, a severe acute respiratory syndrome coronavirus (SARS-CoV) emerged in humans in Guangdong 
Province, China, and caused an epidemic that had severe impact on public health, travel, and economic trade. Coronaviruses are 
worldwide in distribution, highly infectious, and extremely difficult to control because they have extensive genetic diversity, a short 
generation time, and a high mutation rate. They can cause respiratory, enteric, and in some cases hepatic and neurological diseases in 
a wide variety of animals and humans. An enormous, previously unrecognized reservoir of coronaviruses exists among animals. 
Because coronaviruses have been shown, both experimentally and in nature, to undergo genetic mutations and recombination at 
a rate similar to that of influenza viruses, it is not surprising that zoonosis and host switching that leads to epidemic diseases have 
occurred among coronaviruses. 

Analysis of coronavirus genomic sequence data indicates that SARS-CoV emerged from an animal reservoir. Scientists examining 
coronavirus isolates from a variety of animals in and around Guangdong Province reported that SARS-CoV has similarities with 
many different coronaviruses including avian coronaviruses and SARS-CoV-like viruses from a variety of mammals found in live- 
animal markets. Although a SARS-like coronavirus isolated from a bat is thought to be the progenitor of SARS-CoV, a lack of 
genomic sequences for the animal coronaviruses has prevented elucidation of the true origin of SARS-CoV. Sequence analysis of 
SARS-CoV shows that the 5’ polymerase gene has a mammalian ancestry, whereas the 3’ end structural genes (excluding the spike 
glycoprotein) have an avian origin. Spike glycoprotein, the host cell attachment viral surface protein, was shown to be a mosaic of 
feline coronavirus and avian coronavirus sequences resulting from a recombination event. Based on phylogenetic analysis designed to 
elucidate evolutionary links among viruses, SARS-CoV is believed to have branched from the modern Group II coronaviruses, 
suggesting that it evolved relatively rapidly. This is significant because SARS-CoV is likely still circulating in an animal reservoir 
(or reservoirs) and has the potential to quickly emerge and cause a new epidemic. 


RESUMEN. Invited Review: Relationship of the severe acute respiratory syndrome coronavirus 
with avian and other coronaviruses. 

In February 2003, a coronavirus causing severe acute respiratory syndrome (SARS-CoV) emerged in humans in Guangdong 
Province, China, and caused an epidemic with a severe impact on public health, travel, and trade. Coronaviruses are 
distributed worldwide, are highly infectious, and are extremely difficult to control because they have high genetic diversity, 
short generation times, and a high mutation rate. They can cause respiratory and enteric diseases and, in some cases, hepatic and 
neurological diseases in a wide variety of animals and in humans. An enormous, previously unrecognized reservoir of coronaviruses 
exists among animals. Because coronaviruses have been shown, experimentally and in nature, to undergo genetic mutation and 
recombination at a rate similar to that of influenza viruses, it is not surprising that host switching and zoonosis leading 
to epidemic diseases have occurred among coronaviruses. Analysis of the SARS-CoV genome sequence indicates that 
the virus emerged from an animal reservoir. Analyzing coronavirus isolates from a variety of animals in and around 
Guangdong Province, scientists reported that SARS-CoV has similarities with different coronaviruses, including avian 
coronaviruses and SARS-CoV-like viruses from a variety of mammals found in live-animal markets. Although a SARS-CoV-like 
virus isolated from a bat is thought to be the progenitor of SARS-CoV, the lack of coronavirus sequences from this animal 
has prevented elucidation of the true origin of SARS-CoV. Sequence analysis of SARS-CoV shows that the 5’ polymerase gene 
has a mammalian ancestry, whereas the structural genes at the 3’ end (excluding the spike glycoprotein) have an avian origin. 
The spike glycoprotein, the viral surface protein that mediates contact with the host, has been shown to be a mosaic of feline 
coronavirus and avian coronavirus sequences resulting from a recombination event. Based on phylogenetic analyses designed to 
elucidate evolutionary relationships among viruses, SARS-CoV is believed to have split from the modern Group 2 coronaviruses, 
suggesting that it evolved relatively rapidly. This is highly significant because SARS-CoV is probably still circulating in an 
animal reservoir or reservoirs and has the potential to emerge rapidly and cause a new epidemic. 


Key words: severe acute respiratory syndrome, avian coronavirus, phylogenetic relationship, molecular evolution, mutation, 
recombination, animal reservoir, emergence, SARS-CoV, infectious bronchitis virus, turkey coronavirus 


Abbreviations: BCoV = bovine coronavirus; E = envelope; FIPV = feline infectious peritonitis virus; HCoV = human 
coronavirus; HE = hemagglutinin-esterase; IBV = infectious bronchitis virus; M = membrane; MHV = mouse hepatitis virus; 
N = nucleocapsid; PEDV = porcine epidemic diarrhea virus; RBD = receptor binding domain; RdRp = RNA-dependent RNA- 
polymerase; S = spike; SARS = severe acute respiratory syndrome; SARS-CoV = severe acute respiratory syndrome coronavirus; 
TCoV = turkey coronavirus; TGEV = transmissible gastroenteritis virus 
Intensive animal agriculture practices, human population growth, 
and cultural habits and customs have put humans in close contact with 
animal reservoirs of viruses that have the potential to cause zoonotic 
diseases. In February 2003, severe acute respiratory syndrome (SARS) 
in humans was reported in Guangdong Province, the People’s 
Republic of China. The etiological agent of SARS was quickly 
identified as a newly emerged coronavirus, but not before the disease 
spread to over 24 countries in only a few months, infecting 8098 
people worldwide and killing 774 (World Health Organization, 
http://www.who.int/csr/sars/en/; 10,30,36). It is widely believed that 
the SARS coronavirus (SARS-CoV) originated from an animal 
reservoir, but the true origin of the virus is still unknown (23,28). 
A newly recognized reservoir of coronaviruses exists 
among animals. Because coronaviruses have been shown, both 
experimentally and in nature, to undergo genetic recombination by 
a genomic template-switching mechanism and to generate genetic 
point mutations at a rate similar to that of other RNA viruses includ- 
ing influenza A viruses, it is not surprising that zoonosis and host 
switching leading to epidemic diseases occur among coronaviruses. It 
should be noted that coronaviruses among wildlife and domestic 
animals can also pose a threat to the health of commercial poultry. 

Many questions regarding coronavirus origin, evolution, and 
genetic diversity can be answered in part from phylogenetic and 
evolutionary analysis of genome sequence data. Since the first report 
of SARS-CoV, scientists have been examining coronavirus isolates 
from a variety of animals in and around Guangdong Province in an 
attempt to identify natural host reservoirs and determine their roles 
in propagation, evolution, and transmission of these viruses. Geno- 
mic sequence analysis of SARS-CoV shows similarities with many 
different coronaviruses including avian coronaviruses and SARS- 
CoV-like viruses from a variety of mammals found in live animal 
markets in China (22,39). However, a lack of genomic sequences for 
animal coronaviruses has hindered efforts to identify the true origin 
of the SARS-CoV. That information is essential if we are to prevent 
future coronavirus outbreaks. 

Coronaviral diseases. Coronaviruses are worldwide in dis- 
tribution and highly infectious by nature. They cause respiratory, 
enteric, and in some cases hepatic and neurological diseases in a wide 
variety of animals and humans (25). The diseases can be acute or 
chronic and can be transmitted by respiratory or enteric routes. Most 
coronaviruses replicate in the epithelial cells of the upper respiratory 
tract and the enteric tract causing respiratory disease and diarrhea. 
Other sites of infection in the host can include the kidney, re- 
productive tract, liver, spleen, thymus, and brain (18,25). 

The disease caused by the SARS-CoV is characterized by a lower 
respiratory tract infection accompanied by high fever, headache, loss 
of appetite, and diarrhea in about 10%-20% of patients 
(1,34,35). At the onset, a mild respiratory disease is observed with 
a dry cough that usually develops into pneumonia after 7 days (19). 
The SARS-CoV is primarily transmitted by respiratory aerosol but 
the virus is also shed in the feces, making fecal-oral transmission 
a possibility (48). The virus can infect nonhuman primates, civet 
cats, domestic cats, ferrets, mice, and golden Syrian hamsters, 
whereas pigs and chickens are refractory to infection (42,47,48). 

The avian coronaviruses, infectious bronchitis virus (IBV) and 
turkey coronavirus (TCoV), are known to cause a respiratory disease 
in chickens and enteric disease in turkeys, respectively (8,15). 
Infectious bronchitis in chickens is a mild disease characterized by 
watery eyes, catarrhal tracheitis, swollen sinuses, sneezing, coughing, 
and tracheal rales. Turkey coronavirus causes a severe diarrhea in 
poults less than 4 wk of age and has been associated with poult enteritis 
and mortality syndrome (16). The clinical signs can include 
depression, anorexia, and dehydration as a result of a watery diarrhea, 
which can contain urates and mucus. The lower intestines 
including the ceca are often thin-walled and pale (14). Chickens are 
the only animals naturally infected by IBV, and TCoV infects 
turkeys of all ages and possibly chickens, but only young turkeys 
develop clinical disease (6,14). 

Fig. 1. Gene organization for the three groups of coronaviruses and 
for SARS-CoV. The gray boxes depict open reading frames coding for 
known viral proteins and show gene order only; they are not to 
scale. An = polyA tail. 

Coronaviruses. Coronaviruses are enveloped viruses contain- 
ing the largest (~28 to 30 kb) single-stranded positive-sense RNA 
genome known (18). The shape of the virion is pleomorphic and 
thus varies in diameter from 80 nm to 100 nm. The major proteins 
encoded by the viral genome are, in order, starting at the 5’ end, the 
viral polymerase (open reading frames 1a and 1b), hemagglutinin- 
esterase (HE, in some viruses), the spike (S) glycoprotein located on 
the surface of the virus, a small envelope protein (E), an integral 
membrane glycoprotein (M), and the nucleocapsid protein (N), 
which is closely associated with the viral RNA. Several nonstructural 
and regulatory proteins are also encoded by the viral genome (see 
Fig. 1). 

Coronaviruses belong to the family Coronaviridae in the 
Nidovirales order (11). Other families in the Nidovirales order are 
Arteriviridae and Roniviridae, which include viruses that infect swine 
and horses, and viruses that infect invertebrates, respectively. 
Coronaviruses are divided into three groups based on antigenicity and 
genetic structure (see Fig. 1) (25). Group I viruses include, among 
others, transmissible gastroenteritis virus (TGEV) in pigs, feline 
infectious peritonitis virus (FIPV), canine coronavirus, human 
coronavirus (HCoV) strain 229E, and porcine epidemic diarrhea 
virus (PEDV). The viruses in Group I do not have an HE protein; 
the M protein is N-glycosylated and the S-glycoprotein is not 
cleaved. Some examples of Group II coronaviruses are mouse hepa- 
titis virus (MHV), bovine coronavirus (BCoV), and HCoV strain 
OC43. Most of the coronaviruses in Group II have an HE protein; the 
M protein is O-glycosylated and the S-glycoprotein is cleaved into 
two subunits. The Group III coronaviruses include the avian 
coronaviruses IBV, TCoV, and pheasant coronavirus. Viruses in 
Group III do not have an HE protein; the M protein is N- 
glycosylated and the S-glycoprotein is cleaved. Based on sequence 
analysis of the polymerase protein, the SARS-CoV is currently 
placed as a distant member of the Group II coronaviruses. However, 
like Group III avian coronaviruses, it does not have an HE protein, 
and the SARS-CoV membrane protein is N-glycosylated and the 
S-glycoprotein was shown to be cleaved in vitro (44,51). It is also 
interesting to note that the gene organization of the 3’ end of the 
SARS-CoV genome is most similar to the Group III avian viruses 
(Fig. 1). 
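
The group traits described above can be laid out as a small lookup table, which makes the comparison with SARS-CoV explicit. The following is only an illustrative sketch: the class and variable names are hypothetical, the trait values are taken from the text, and the statement that "most" Group II viruses have an HE protein is simplified to a single True value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupTraits:
    has_he: bool          # hemagglutinin-esterase (HE) protein present
    m_glycosylation: str  # "N" or "O": glycosylation type of the M protein
    s_cleaved: bool       # S-glycoprotein cleaved into S1 and S2 subunits


# Trait values as stated in the text. Assumption: "most Group II viruses
# have an HE protein" is simplified here to has_he=True.
GROUPS = {
    "Group I": GroupTraits(has_he=False, m_glycosylation="N", s_cleaved=False),
    "Group II": GroupTraits(has_he=True, m_glycosylation="O", s_cleaved=True),
    "Group III": GroupTraits(has_he=False, m_glycosylation="N", s_cleaved=True),
}

# SARS-CoV: no HE protein, N-glycosylated M protein, S cleaved (in vitro).
SARS_COV = GroupTraits(has_he=False, m_glycosylation="N", s_cleaved=True)


def shared_traits(a: GroupTraits, b: GroupTraits) -> int:
    """Count how many of the three tabulated traits two entries share."""
    return sum([
        a.has_he == b.has_he,
        a.m_glycosylation == b.m_glycosylation,
        a.s_cleaved == b.s_cleaved,
    ])


scores = {name: shared_traits(SARS_COV, traits) for name, traits in GROUPS.items()}
closest = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score} of 3 traits shared with SARS-CoV")
```

On these three structural traits alone, SARS-CoV matches Group III most closely, even though polymerase sequence analysis places it with Group II. A trait count of this kind is a summary of the text, not a classification method.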

The most studied coronavirus structural protein is the S- 
glycoprotein. The S-glycoprotein forms club-shaped projections 
on the surface of the virus particles. It is anchored in the viral enve- 
lope, and in Group II and III coronaviruses, it is posttranslationally 
cleaved by host cell serine proteases into two subunits designated S1 
and S2. The S-glycoprotein mediates host cell attachment and entry. 
Generally, coronaviruses bind to specific host cell receptors, and 
several have been identified. The SARS-CoV binds to angiotensin- 
converting enzyme 2, whereas the avian coronaviruses utilize sialic 
acid alpha 2,3 linked to galactose and possibly a secondary receptor 
protein, aminopeptidase N, or other protein, to attach and enter 
cells (25,27,31,49,50). When the S-glycoprotein binds to its specific 
cell receptor, it undergoes a conformational change involving two 
heptad repeat regions that brings the S-glycoprotein fusion peptide 
in close proximity to the viral transmembrane segment, which facil- 
itates membrane fusion and entry into the cell (2). 

The S-glycoprotein plays a role in the pathogenesis of the virus 
(3,25,48). A receptor-binding domain (RBD) within the amino 
terminus of the S-glycoprotein forms part of the globular head of the 
mature protein. The RBD of SARS-CoV (residues 318-510) maps 
to the hypervariable region III in IBV (residues 274-387), which is 
associated with neutralizing epitopes on the S-glycoprotein (32,50). 
Changes in the S-glycoprotein and specifically in the RBD can 
alter host cell specificity and mediate shifts in viral pathogenesis 
(17,24,37). Replacing the S-glycoprotein gene of an attenuated 
respiratory strain of TGEV with the S-glycoprotein from a patho- 
genic enteric strain changed the respiratory strain into a pathogenic 
enterotropic virus in pigs (37). Changes in the S-glycoprotein can 
also lead to host switching. Using targeted recombination, the MHV 
S-glycoprotein gene was replaced with the FIPV S-glycoprotein gene, and 
the recombinant MHV carrying the feline S-glycoprotein 
gained the ability to infect feline cells and lost its ability to infect 
murine cells (24). A host shift was also observed when TCoV, which 
is closely related to IBV, acquired an S-glycoprotein gene of unknown 
origin, allowing it to emerge and cause enteric disease in turkeys (20). 
However, pathogenicity does not appear to be solely related to the S- 
glycoprotein in some coronaviruses. For example, when the ecto- 
domain of the S-glycoprotein gene in the nonpathogenic Beaudette 
strain of IBV was replaced with the S-glycoprotein from a pathogenic 
strain of the same serotype (Mass 41), no difference in pathogenicity 
was observed between the recombinant and the parental strain (17). 

Viral replication and genetic diversity. Coronaviruses 
have a short generation time and a high mutation rate, which 
provides the virus with extensive genetic diversity making it ex- 
tremely difficult to control. To understand how genetic diversity is 
achieved in coronaviruses it is necessary to know how the virus 
replicates. After attachment and entry into the cell, the positive-sense 
viral genome acts as a messenger RNA (mRNA) for the translation 
of the viral RNA-dependent RNA-polymerase (RdRp). Then, using 
a leader-primed mRNA synthesis mechanism, the polymerase gen- 
erates a 3’ coterminal nested set of subgenomic-sized mRNAs that 
code for the other viral proteins. The polymerase also replicates the 
full-length viral genome by the same mechanism. During this pro- 
cess, the polymerase can generate genetic diversity by two means. 
First, the RdRp does not have proofreading capabilities, so it cannot 
fix mistakes made while copying the viral genome. The RdRp is 
estimated to generate mutations at a rate of 5.7 × 10⁻⁶ nucleotide 
substitutions per site per day or 0.17 mutations per genome per day, 
which is similar to rates reported for avian influenza viruses (40,43). 
Second, the RdRp can also produce genetic diversity by a template- 
switching mechanism (5). When two or more different strains of 
coronavirus enter the same cell, recombination can occur as a result 
of switching templates from one viral genome to another. In this 
way, whole genes or genome segments can be exchanged between 
viruses. This was shown to occur in vaccine strains in the field (21) 
and in a natural outbreak of IBV in Texas where a “hot spot” for 
recombination was identified in the S1 gene (45,46). A recent ex- 
ample of template switching occurred in the avian coronaviruses and 
led to the emergence of TCoV. Analysis of the TCoV S-glycoprotein 
gene showed the S1 portion to be genetically unique with a putative 
recombination crossover site identified in the 5’ end of the S2 gene 
(20). The other genes (3ab, M, 5ab, and N) downstream of the 
crossover site, including the majority of the S2 gene, were nearly 
identical to IBV (4,7,14,20). The ancestor of TCoV is clearly IBV, 
but the origin of the TCoV S-glycoprotein gene, which allowed that 
virus to emerge and cause disease in turkeys, is not known. 
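
The two mutation rates quoted above can be checked against each other, because the per-genome rate is simply the per-site rate multiplied by the genome length. The sketch below works through that arithmetic. It reads the per-site rate as 5.7 × 10⁻⁶, the exponent that is consistent with 0.17 mutations per genome per day for a genome near 30 kb; the function names are hypothetical, and the one-year projection is a naive linear extrapolation that ignores selection and is shown only to illustrate the scale.

```python
PER_SITE_PER_DAY = 5.7e-6   # substitutions per site per day (exponent assumed, see lead-in)
PER_GENOME_PER_DAY = 0.17   # mutations per genome per day, as quoted in the text


def implied_genome_length(per_genome: float, per_site: float) -> float:
    """Genome length (nt) that makes the two quoted rates consistent."""
    return per_genome / per_site


def per_genome_rate(per_site: float, genome_length_nt: float) -> float:
    """Per-genome mutation rate from a per-site rate and a genome length."""
    return per_site * genome_length_nt


def expected_mutations(per_genome: float, days: float) -> float:
    """Naive linear expectation of accumulated mutations over a time span."""
    return per_genome * days


length_nt = implied_genome_length(PER_GENOME_PER_DAY, PER_SITE_PER_DAY)
print(f"Implied genome length: {length_nt:,.0f} nt")
print(f"Rate for a 29,825-nt genome: {per_genome_rate(PER_SITE_PER_DAY, 29_825):.3f} per day")
print(f"Expected mutations in 365 days: {expected_mutations(PER_GENOME_PER_DAY, 365):.1f}")
```

The implied length falls inside the ~28 to 30 kb range given for coronavirus genomes, which supports reading the two figures as a consistent pair.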

SARS-CoV genetic similarity with IBV, TCoV, and other 
coronaviruses. Recombination events contribute to the evolu- 
tion and emergence of coronaviruses by creating mosaic viruses. A 
BLAST analysis (www.ncbi.nlm.nih.gov/BLAST) as well as phylo- 
genetic and recombination studies of the SARS-CoV genome 
showed similarities with IBV, BCoV, HCoV (229E), MHV, PEDV, 
and TGEV (52). Stavrinides and Guttman (39), using Bayesian, 
neighbor-joining, and split decomposition phylogenetic analyses of 
the full-length SARS-CoV genome, showed that the polymerase 
protein (5’ region of the genome) had a mammalian ancestry and 
was most closely related to the Group II coronaviruses. In addition, 
they found evidence that the M and N proteins (3’ region of the 
genome) had an avian coronavirus origin, which was also shown by 
Marra et al. (30). Because the S-glycoprotein mediates cell attach- 
ment and is responsible for host specificity, it was interesting that the 
SARS-CoV S gene was found to be a mosaic containing both avian 
infectious bronchitis virus (Group III) and feline infectious perito- 
nitis virus (Group I) sequences (39). The feline sequences, evident in 
the first 600 residues, are interesting because palm civet cats appear 
to play a role in the epidemiology of this disease. The IBV-related 
sequences were evident between residues 601 and 667, suggesting 
that a recombination event occurred in the middle of the S- 
glycoprotein gene, which may have contributed to the observed shift 
in host range. It is unclear which animal may have been the inter- 
mediary host. 

Phylogenetic studies on TCoV conducted in our laboratory 
showed that sequences in the TCoV S-glycoprotein were also found 
in the SARS-CoV (20). Similarities with TCoV appear to be relatively 
distant and are located in the heptad repeat regions in the S2 
subunit. Two heptad repeat regions are commonly found in many 
class I virus fusion proteins such as avian influenza hemagglutinin, 
paramyxovirus fusion protein, and coronavirus S-glycoprotein. 
Those regions are conserved structural features involved in protein 
folding that facilitates viral and cellular membrane fusion, and sim- 
ilarities among them likely do not indicate a direct parental link (9). 
Thus, sequence similarities between SARS-CoV and other coronaviruses, 
including avian coronaviruses, that appear to indicate virus origin need 
to be carefully considered in the context of functionally conserved 
regions within the viral genome. 

SARS origin and evolution. Initially it was reported that 
SARS-CoV resulted from a zoonotic shift in a coronavirus from the 
palm civet cat (Paguma larvata) and/or the raccoon dog (Nyctereutes 
procyonoides) (22). However, phylogenetic analysis of recent sequence data 
indicates that a SARS-like bat coronavirus could be the progenitor of 
SARS-CoV (28). Animals such as palm civet cats, raccoon dogs, and 
fruit bats, commonly found in the market place in China, are in 
close contact with people and are likely the source of the SARS-CoV. 
Unfortunately, the true origin of SARS-CoV is still not clear. Serol- 
ogy, epidemiology, and pathogenicity studies along with phyloge- 
netic analysis of genomic sequence data for a wide variety of animal 
coronaviruses will be needed before the evolutionary origin of 
SARS-CoV can be identified. 

A tremendous amount of work has recently been conducted on 
generating, organizing, and analyzing the sequence data for the 
SARS-CoV, but similar analyses have not been conducted for the 
animal coronaviruses. Using the Molecular Evolutionary Genetics 
Analysis program (MEGA 3.1, www.megasoftware.net), we gener- 
ated a phylogenetic tree (Fig. 2) from alignments for all the available 
animal coronavirus whole genomic sequences listed by the Patho- 
Systems Resource Integration Center (https://patric.vbi.vt.edu/) and 
available in GenBank (www.ncbi.nlm.nih.gov) as well as selected 
SARS-CoV isolates. The Nei-Gojobori (p-distance) analysis shows 
evolutionary trends among those viruses. Two major lineages are 
generated with human SARS-CoV isolates branching from the 
Group II coronaviruses as described in the literature (38). In addi- 
tion, the bat SARS-like coronaviruses (HKU3-1 and Rp3) and 
a civet cat (010) isolate group with two early human SARS-CoV 
isolates (Tor2 and Urbani) in lineage 2. The other animal-origin 
SARS-like viruses are in lineage 1 indicating that they are likely not 
progenitors of SARS-CoV. 
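
To make the distance measure behind such a tree concrete, the sketch below computes a plain nucleotide p-distance, the proportion of aligned sites at which two sequences differ. It is a minimal illustration and not the MEGA implementation: it does not perform the codon-based Nei-Gojobori calculation, the sequences are short made-up strings rather than coronavirus genomes, and the names `virus_A`, `virus_B`, and `virus_C` are hypothetical.

```python
from itertools import combinations

SKIPPED = {"-", "N"}  # alignment gaps and ambiguous bases are excluded


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in SKIPPED or b in SKIPPED:
            continue  # pairwise deletion: ignore sites missing in either sequence
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy alignment (made-up sequences, for illustration only).
alignment = {
    "virus_A": "ATGCTAGCTA",
    "virus_B": "ATGCTAGCTT",
    "virus_C": "ATGATCGC-A",
}

distances = {
    (x, y): p_distance(alignment[x], alignment[y])
    for x, y in combinations(alignment, 2)
}

for (x, y), d in distances.items():
    print(f"{x} vs {y}: p-distance = {d:.3f}")
```

A tree-building method such as neighbor-joining then takes a matrix of these pairwise distances as input; sequences separated by small distances, like the first two toy sequences here, end up on neighboring branches.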

As discussed above, the SARS-CoV has sequence similarities to 
a number of coronaviruses indicating that recombination played 
a role in its origin; however, the SARS-CoV likely did not emerge 
from a recombination event between mammalian and avian coro- 
naviruses. It appears that it emerged because of genetic drift, which is 
the accumulation of genetic mutations over time. A detailed analysis 
of signature variation residues within the S-glycoprotein from 
a number of SARS-CoV isolates obtained from palm civets at the 
live-animal market where the SARS-CoV emerged showed an 
accumulation of mutations over a 2-yr period leading to the epidemic 
group of viruses that caused disease in humans (22). Signature 
variation residues are unique changes in a gene that can be used to 
trace virus sources and track virus evolution within a defined envi- 
ronment. It is likely that the palm civet is an intermediate host (if it 
is assumed that bats are the reservoir for the virus), which con- 
tributed to rapid evolution of the virus and subsequent jump to 
humans. Accelerated positive Darwinian selection following a shift 
to an intermediate host is well known for avian influenza A viruses 
(13,26,33,40,41) and most likely also occurs in coronaviruses. 

Phylogenetic analysis of the RdRp gene showed that the SARS- 
CoV is a close relative of the Group II coronaviruses (23,29,38). 
It is now widely believed that SARS-CoV branched from 
the modern Group II coronaviruses and does not represent a new 
coronavirus group, which suggests a recent ancestor for the 
virus (12). This is significant because it indicates that the virus 
evolved relatively rapidly to cause disease in humans and could con- 
tinue to evolve in an animal reservoir, potentially causing another 
SARS epidemic in the future. 

Based on information from the World Health Organization 
(www.who.int/csr/sars/postoutbreak/en/) the SARS epidemic was 
officially controlled on July 5, 2003. Although the SARS-CoV does 
not appear to be currently circulating in people, it is likely being 
maintained in an animal reservoir (or reservoirs) and could potentially 
emerge to cause a new epidemic. As with avian influenza viruses, 
genetic mutations and exchange of genetic information among 
coronaviruses lead to the emergence of new viruses capable of 
infecting and causing disease in animals and humans. The potential 
of animal coronaviruses to serve as a genetic reservoir for human 
disease makes it extremely important to identify the genes and 
genomes among coronaviruses in animals, especially those animals in 
close contact with people. Only then will it be possible to monitor 
reservoirs of the virus, which, along with a vigilant global public 
health surveillance program, is essential for the future control of 
this disease. 